A Survey of Techniques for Designing I/O-Efficient Algorithms
Authors
Abstract
This survey introduces elementary techniques used for designing I/O-efficient algorithms. We do not intend to give a complete survey of all state-of-the-art techniques; rather, we aim to provide the reader with a good understanding of the most elementary ones. Our focus is on general techniques and on techniques used in the design of I/O-efficient graph algorithms. We include the latter because many abstract data structuring problems can be translated into classical graph problems. While this fact is of mostly philosophical interest in general, it gains importance in I/O-efficient algorithms: random access is penalized in external memory algorithms, and standard techniques for extracting information from graphs can help when extracting information from pointer-based data structures.
For the analysis of the I/O-complexity of the algorithms, we adopt the Parallel Disk Model (PDM) (see Chapter 1) as the model of computation. We restrict our discussion to the single-disk case (D = 1) and refer the reader to appropriate references for the case of multiple disks. To improve the readability of the text, we do not worry too much about the integrality of parameters that arise in the discussion. That is, we write x/y to denote ⌊x/y⌋ or ⌈x/y⌉, as appropriate. The same applies to expressions such as log x, √x, etc.
We begin our discussion in Section 3.2 with an introduction to two general techniques that are applied in virtually all I/O-efficient algorithms: sorting and scanning. In Section 3.3 we describe a general technique to derive I/O-efficient algorithms from efficient parallel algorithms. Using this technique, the huge repository of efficient parallel algorithms can be exploited to obtain I/O-efficient algorithms for a wide range of problems. Sections 3.4 through 3.7 are dedicated to the discussion of techniques used in I/O-efficient algorithms for fundamental graph problems.
The choice of the graph problems we consider is based on the importance of these problems as tools for solving other problems that are not of a graph-theoretic nature.
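As a rough illustration of why scanning and sorting serve as the basic cost units in the single-disk model, the sketch below counts block transfers for a scan and for a textbook multiway external merge sort. The symbols N (items), M (items that fit in main memory), and B (items per block) and all function names are assumptions introduced here and do not appear in the abstract. The counts follow the standard bounds of Θ(N/B) I/Os for scanning and Θ((N/B) log_{M/B}(N/B)) I/Os for sorting. The explicit ceilings show the rounding that the text chooses to ignore.

```python
def ceil_div(x: int, y: int) -> int:
    """Integer ceiling of x / y without floating-point error."""
    return -(-x // y)


def scan_ios(n: int, b: int) -> int:
    """Block transfers needed to read n contiguous items, b items per block."""
    return ceil_div(n, b)


def merge_sort_ios(n: int, m: int, b: int) -> int:
    """Estimated block transfers for a multiway external merge sort.

    Assumed scheme (illustrative, not taken from the survey):
    - one pass forms ceil(n / m) sorted runs of at most m items each;
    - each later pass merges up to m // b - 1 runs at a time, keeping
      one block of memory free as the output buffer;
    - every pass reads and writes all the data once.
    """
    fan_in = m // b - 1
    if fan_in < 2:
        raise ValueError("memory must hold at least three blocks")
    runs = ceil_div(n, m)
    passes = 1  # the run-formation pass
    while runs > 1:
        runs = ceil_div(runs, fan_in)
        passes += 1
    return 2 * passes * scan_ios(n, b)


if __name__ == "__main__":
    n, m, b = 1_000_000, 10_000, 100
    print("scan I/Os:", scan_ios(n, b))
    print("sort I/Os:", merge_sort_ios(n, m, b))
```

With these hypothetical parameters the sort costs only a small constant number of passes over the data, which is why sorting is treated as a cheap primitive compared with one random access per item.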
Similar resources
External-Memory Algorithms with Applications in GIS
Preface In the design of algorithms for large-scale applications it is essential to consider the problem of minimizing Input/Output (I/O) communication. Geographical information systems (GIS) are good examples of such large-scale applications as they frequently handle huge amounts of spatial data. In this note we survey the recent developments in external-memory algorithms with applications in ...
Efficient Approximation Algorithms for Point-set Diameter in Higher Dimensions
We study the problem of computing the diameter of a set of $n$ points in $d$-dimensional Euclidean space for a fixed dimension $d$, and propose a new $(1+\varepsilon)$-approximation algorithm with $O(n+ 1/\varepsilon^{d-1})$ time and $O(n)$ space, where $0 < \varepsilon \leqslant 1$. We also show that the proposed algorithm can be modified to a $(1+O(\varepsilon))$-approximation algorithm with $O(n+...
Nearest Neighbor based Clustering Algorithm for Large Data Sets
Clustering is an unsupervised learning technique in which data or objects are grouped into sets based on some similarity measure. Most of the clustering algorithms assume that the main memory is infinite and can accommodate the set of patterns. In reality many applications give rise to a large set of patterns which does not fit in the main memory. When the data set is too large, much of the dat...
EVALUATING EFFICIENCY OF BIG-BANG BIG-CRUNCH ALGORITHM IN BENCHMARK ENGINEERING OPTIMIZATION PROBLEMS
Engineering optimization needs easy-to-use and efficient optimization tools that can be employed for practical purposes. In this context, stochastic search techniques have good reputation and wide acceptability as being powerful tools for solving complex engineering optimization problems. However, increased complexity of some metaheuristic algorithms sometimes makes it difficult for engineers t...
AN EFFICIENT METAHEURISTIC ALGORITHM FOR ENGINEERING OPTIMIZATION: SOPT
Metaheuristic algorithms are well-known optimization tools which have been employed for solving a wide range of optimization problems so far. In the present study, a simple optimization (SOPT) algorithm with two main steps namely exploration and exploitation, is provided for practical applications. Aside from a reasonable rate of convergence attained, the ease in its implementation and dependen...